feat(entity-store-perf): parallel bulk ingest, ingest-rate, duration mode, and entity poll fixes #383
Open
gurevichdmitry wants to merge 5 commits into
Pull request overview
This PR enhances the entity_store_perf upload commands to better support sustained load/performance testing by increasing ingest throughput, adding optional throttling and duration-based execution, and making post-upload entity polling more reliable (with timeouts and clearer progress).
Changes:
- Add configurable parallel bulk ingest (`--bulk-concurrency`) and optional per-upload ingest throttling (`--ingest-rate`), plus achieved ingest-rate logging.
- Add duration-based interval runs (`--duration`, mutually exclusive with `--count`) and refactor the interval loop into a shared helper.
- Improve post-upload entity polling with timeout support, `match_all` counting when `--deleteData` is used, and progress/status logs; update docs accordingly.
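For reviewers who want a concrete picture of the polling change, here is a minimal sketch of polling with a timeout and progress reporting. It is not the PR's code: `pollEntityCount`, `PollOptions`, and `getCount` are hypothetical names, and `getCount` stands in for whatever count query the command actually issues (for example a `match_all` count when `--deleteData` is used).

```typescript
// Hypothetical sketch: poll a count until it reaches a target or a timeout expires.
interface PollOptions {
  expected: number; // entity count to wait for
  timeoutMs: number; // give up after this much wall-clock time
  intervalMs: number; // pause between polls
  onProgress?: (count: number, elapsedMs: number) => void;
}

interface PollResult {
  reached: boolean;
  count: number;
  elapsedMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollEntityCount(
  getCount: () => Promise<number>, // assumption: wraps the real count query
  opts: PollOptions
): Promise<PollResult> {
  const start = Date.now();
  for (;;) {
    const count = await getCount();
    const elapsedMs = Date.now() - start;
    opts.onProgress?.(count, elapsedMs);
    // Success is checked before the timeout so a final matching poll still counts.
    if (count >= opts.expected) return { reached: true, count, elapsedMs };
    if (elapsedMs >= opts.timeoutMs) return { reached: false, count, elapsedMs };
    await sleep(opts.intervalMs);
  }
}
```

Returning a result object rather than throwing on timeout lets the caller decide whether a missed target should fail the run or only be logged.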
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/commands/shared/elasticsearch.ts | Adds optional pipeline to bulkUpsert, extends streamingBulkIngest with defaults + concurrency/pipeline support. |
| src/commands/entity_store_perf/index.ts | Adds CLI flags for bulk concurrency, ingest rate, and duration; updates argument handling/validation. |
| src/commands/entity_store_perf/entity_store_perf.ts | Implements new upload batching + parallel bulk indexing, ingest-rate throttling, duration mode, and improved entity polling logic. |
| src/commands/entity_store_perf/README.md | Documents new flags (--bulk-concurrency, --ingest-rate, --duration) and updated interval/polling behavior. |
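As an illustration of the argument validation mentioned for `index.ts`, the sketch below shows one way to parse a duration such as `5m` and to reject `--count` combined with `--duration`. The function names, the accepted units (`s`, `m`, `h`), and the error messages are assumptions made for this example, not the PR's actual behavior.

```typescript
// Hypothetical sketch: duration parsing and --count / --duration exclusivity.
type RunMode =
  | { kind: "count"; count: number }
  | { kind: "duration"; durationMs: number };

function parseDuration(input: string): number {
  // Assumption: durations look like "<integer><unit>" with unit s, m, or h.
  const match = /^(\d+)(s|m|h)$/.exec(input.trim());
  if (!match) {
    throw new Error(`Invalid duration "${input}" (expected e.g. 30s, 5m, 1h)`);
  }
  const unitMs = { s: 1000, m: 60000, h: 3600000 } as const;
  return Number(match[1]) * unitMs[match[2] as keyof typeof unitMs];
}

function resolveRunMode(opts: { count?: number; duration?: string }): RunMode {
  if (opts.count !== undefined && opts.duration !== undefined) {
    throw new Error("--count and --duration are mutually exclusive");
  }
  if (opts.duration !== undefined) {
    return { kind: "duration", durationMs: parseDuration(opts.duration) };
  }
  if (opts.count === undefined) {
    // Assumption: the real CLI may apply a default count instead of failing.
    throw new Error("Either --count or --duration is required");
  }
  return { kind: "count", count: opts.count };
}
```

In duration mode the interval loop would then compare `Date.now()` against a deadline computed from `durationMs` before starting each pass.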
Summary
Improves `upload-perf-data-interval` (and single `upload-perf-data`) for sustained load testing: higher log ingest throughput, optional rate limiting, duration-based runs, and reliable post-upload entity polling.
Changes
- Parallel `_bulk` requests (`pMap`, default concurrency 8, `pipeline=_none`, `refresh=false`) instead of a single `helpers.bulk` stream.
- `--bulk-concurrency`: CLI flag to tune parallel bulks (default `8`; values above 8 did not improve throughput on our test cluster).
- `--ingest-rate`: Optional max docs/sec per upload (the throttle pauses between batches). Logs the achieved rate and notes when the cluster, not the flag, is the bottleneck.
- `--duration`: Wall-clock interval mode (mutually exclusive with `--count`); keeps uploading until the deadline, with `--interval` pauses between passes.
- Updated `entity_store_perf/README.md` for new flags and behavior.

`yarn start upload-perf-data-interval medium --deleteData --noTransforms --interval 2 --duration 5m --ingest-rate 25000 --bulk-concurrency 8`
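The parallel bulk ingest and `--ingest-rate` throttling listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `mapWithConcurrency` is a small stand-in for `pMap`, `sendBulk` is a hypothetical callback wrapping one `_bulk` request, and pausing after each wave of `concurrency` batches is one possible reading of "throttle pauses between batches".

```typescript
// Hypothetical sketch: bounded-concurrency batches with a docs/sec ceiling.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

function chunk<T>(items: readonly T[], size: number): T[][] {
  const step = Math.max(1, Math.floor(size));
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += step) out.push(items.slice(i, i + step));
  return out;
}

// How long to pause so that docsSent / elapsed does not exceed maxDocsPerSec.
function throttleDelayMs(docsSent: number, elapsedMs: number, maxDocsPerSec: number): number {
  const minElapsedMs = (docsSent / maxDocsPerSec) * 1000;
  return Math.max(0, Math.ceil(minElapsedMs - elapsedMs));
}

// Stand-in for pMap: run fn over items with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  };
  const workers = Math.max(1, Math.min(Math.floor(limit), items.length));
  await Promise.all(Array.from({ length: workers }, () => worker()));
  return results;
}

interface IngestOptions {
  batchSize: number;
  concurrency: number; // cf. --bulk-concurrency
  maxDocsPerSec?: number; // cf. --ingest-rate; unthrottled when omitted
}

// Returns the achieved ingest rate in docs/sec.
async function ingest<T>(
  docs: readonly T[],
  sendBulk: (batch: T[]) => Promise<void>, // assumption: one _bulk request per call
  opts: IngestOptions
): Promise<number> {
  const start = Date.now();
  const concurrency = Math.max(1, Math.floor(opts.concurrency));
  const batches = chunk(docs, opts.batchSize);
  let sent = 0;
  for (let i = 0; i < batches.length; i += concurrency) {
    const wave = batches.slice(i, i + concurrency);
    await mapWithConcurrency(wave, concurrency, (batch) => sendBulk(batch));
    sent += wave.reduce((n, b) => n + b.length, 0);
    if (opts.maxDocsPerSec !== undefined) {
      const delay = throttleDelayMs(sent, Date.now() - start, opts.maxDocsPerSec);
      if (delay > 0) await sleep(delay);
    }
  }
  const elapsedSec = Math.max((Date.now() - start) / 1000, 0.001);
  return sent / elapsedSec;
}
```

If the computed delay is always zero, the cluster is slower than the requested ceiling, which matches the case where the PR logs that the cluster, not the flag, is the bottleneck.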